Loss-less compression and decompression of bitmaps using strokes

ABSTRACT

A method of compressing a bitmap of a symbol includes dividing up the symbol into one or more strokes which include a number of parallel, laterally adjacent, continuous line segments, run-length encoding each stroke to form a stream of line codes for that stroke, where the stream of line codes provides absolute values for position and length of one line segment and relative values of position and length for the other line segments; and then presenting the streams of the line codes in sequence, as a set representing the symbol.

FIELD OF THE INVENTION

This invention relates generally to loss-less compression and decompression of bitmaps, and in particular to a method of compressing a bitmap of a symbol, a compressed bitmap of a symbol, a method of decompressing a compressed bitmap, and a computer for decompressing a compressed bitmap.

BACKGROUND OF THE INVENTION

It is known to compress bitmaps, for example using run-length and outline techniques, so that less space is required to store them and so that less time is required to transmit them at a particular transmission rate. However, compression and decompression take time and processing power. For example, the world-wide-web has generated a need to have a range of large banner fonts. These can be stored and transmitted as outlines, which are compact, but a significant amount of processing power is required to render or decompress them from this form.

SUMMARY OF THE INVENTION

The present invention is concerned with providing high compression ratios of bitmaps, and with providing fast decompression. The invention is more particularly, but not exclusively, concerned with a compression/decompression scheme which enables compressed symbol files to be decompressed with simple or cheap processors and dedicated hardware, thus making the invention particularly useful for world-wide-web/internet browsers or viewers.

In accordance with a first aspect of the present invention, there is provided a method of compressing a bitmap of a symbol, in which the symbol is considered as being made up of a plurality of strokes, and each stroke is considered as being made up of a plurality of continuous lines, and comprising the steps of: run-length encoding each stroke to form a stream of line codes for that stroke; and presenting the streams of the line codes for the strokes in sequence as a set.

It will therefore be appreciated that the invention provides a development of run-length encoding. However, the separation of the symbol into a plurality of strokes removes the need to encode the white space between those strokes, as it can be inferred to be background. By contrast, with simple run-length encoding, such white space tends to produce long variable-length runs which do not compress well.

In one example of the method, the step of run-length encoding each stroke comprises the steps of: encoding each line or group of lines of that stroke by one of a plurality of encoding methods to form a line code; and presenting the line codes as a stream in the order in which the lines appear in that stroke, at least some of the lines each being encoded in dependence upon their position and length relative to the preceding line in that stroke.

At least one of the encoding methods may produce such a line code comprising: a control code indicative of that encoding method; and at least one parameter value for that control code. Various of these encoding methods may comprise the step of providing, as such parameter values:

(a) a two-dimensional position of one end of the respective line; and the length of that line; or

(b) a two-dimensional position of one end of the respective line; and a one-dimensional position of the other end of that line; or

(c) an offset between one end of the respective line and one end of the preceding line; and a difference between the length of the respective line and that of the preceding line; or

(d) an offset between one end of the respective line and one end of the preceding line; and an offset between the other end of the respective line and the other end of the preceding line; or

(e) a number of repeats of the previous line; or

(f) a number of lines; and for each line in that number an indication of whether or not each end of that line is offset by one pixel in the line direction from the corresponding end of the preceding line (in this case, the encoding method may further include the step of providing, as such a parameter value: an indication of whether the offset (if any) of one end of each of the respective lines is in one direction or in the opposite direction).

At least one of the encoding methods may produce, as such a line code, a parameterless control code. For example, at least some of the parameterless control codes may each be:

(g) a predetermined function of: an offset between one end of the respective line and one end of the preceding line; and a difference between the length of the respective line and that of the preceding line; or

(h) a predetermined function of: an offset between one end of the respective line and one end of the preceding line; and an offset between the other end of the respective line and the other end of the preceding line.

By suitable choice of these methods, specific features of symbols (or parts of them) can be taken into account to produce a compression ratio which is perhaps more than twice that which is achievable using simple run-length encoding. Many symbols have a high degree of correlation between vertically adjacent horizontal pixel rows, and this can be taken advantage of, particularly by methods (c) to (h), to produce high compression ratios.

At least some of these encoding methods may be paired. For example:

(i) in the case of a pair of methods (c) above, the maximum offset and/or difference for one of those two methods may be different to that or those for the other of those two methods; or

(j) in the case of a pair of methods (d) above, the or each maximum offset for one of those two methods may be different to that or those for the other of those two methods; or

(k) in the case of a pair of methods (e) above, the maximum number of repeats for one of those two methods may be different to that for the other of those two methods.

It will therefore be appreciated that, by choosing the method with the smaller maximum, if that is possible, to encode a particular line or group of lines, fewer bits are produced. Preferably the method further comprises the step of adding a header indicative of the number of streams in the set and the length of each stream.

In accordance with a second aspect of the present invention, there is provided a compressed bitmap of a symbol, produced by the method of the first aspect of the invention.

In accordance with a third aspect of the present invention, there is provided a compressed bitmap of a symbol which is considered as being made up of a plurality of strokes each of which is considered as being made up of a plurality of continuous lines, the compressed bitmap comprising: for each of the strokes, a plurality of run-length encoded line codes each for one of the lines or a group of the lines in that stroke, the line codes being arranged as a stream in the order in which the lines appear in that stroke, and the streams of the line codes for the strokes being arranged in sequence as a set. Preferably, the line codes are encoded by a plurality of encoding methods, at least some of the lines each being encoded in dependence upon their position and length relative to the preceding line in that stroke.

In accordance with a fourth aspect of the present invention, there is provided a method of decompressing a compressed bitmap according to the second or third aspect of the invention, and comprising the steps of: decoding each line code to form a respective line definition; and rendering the line defined by each line definition. Preferably, the decoding of at least two of the streams of line codes overlaps temporally. In the case of decoding a parametered line code as mentioned above, the decoding step may comprise the steps of determining the control code in the parametered line code, determining the or each parameter of that line code, and forming the line definition in accordance with the value of the control code and the value of the or each parameter. In the case of decoding a parameterless line code as mentioned above, the decoding step may comprise the step of forming the line definition in accordance with the value of the control code.

In accordance with a fifth aspect of the present invention, there is provided a computer which is programmed by software to perform the method according to the fourth aspect of the present invention. The invention may therefore be applicable to a general purpose computer which is programmed by software to produce the technical advantages of the invention.

In accordance with a sixth aspect of the present invention, there is provided a computer comprising: a processor; memory accessible by the processor for storing data including a compressed bitmap, according to the second or third aspect of the present invention, and rendered symbol data; and a decompressor circuit, the decompressor circuit comprising: means for generating coordinate value pairs; and at least one stroke engine operable to read a stream of the line codes from the memory and to decode the line codes in dependence upon the coordinate value pairs to produce an output signal. The decompressor circuit may have a plurality of such stroke engines and further include an access arbiter for arbitrating requests from the stroke engines to access the memory. The generating means may comprise: a first sequencer which is operable to step through values of one of the coordinates; a second sequencer which, for each value of said one coordinate, is operable to step through values of the other coordinate; and a write controller which is operable to address the memory in dependence upon the coordinate values and supply data to the memory in dependence upon the output signal(s).

BRIEF DESCRIPTION OF THE DRAWINGS

A specific embodiment of the present invention will now be described by way of non-limiting example with reference to the accompanying drawings, in which:

FIG. 1 shows a first stroke, comprising a pair of elements, produced by decoding a first bitstream for forming a Times Roman lower case letter “r”;

FIGS. 2 & 3 show second and third strokes produced by decoding second and third bitstreams for forming the letter;

FIG. 4 shows the resulting letter produced by combining the strokes;

FIG. 5 is a schematic diagram of a computer forming an embodiment of the invention; and

FIG. 6 is a schematic diagram showing more detail of one example of the decompressor shown in FIG. 5, including in this example three stroke-engines.

DETAILED DESCRIPTION

In the example of the invention which is now to be described, a symbol file consists of two parts, a simple header and a data-part consisting of one or more bitstreams.

The header defines how many bitstreams are present in the data-part and gives the length in bits of each bitstream. There is a maximum of 15 (2⁴−1) bitstreams, and each bitstream can be a maximum of 4095 (2¹²−1) bits long. Numeric data is defined (within both the header and the data-part) using a varying length representation with least-significant bits first, e.g. a 5 bit code is stored in the order bit 0, bit 1, etc. This is referred to in this specification as “LSBF” binary.

Symbol metrics are not stored within the symbol file, but may be contained in a separate kerning/separation table pertinent to a set of symbol files.

Specifically, in the example, the header of the symbol file data is structured as: the number of bitstreams, which is an unsigned 4 bit integer; and, for each bitstream, the length of that bitstream as an unsigned 12 bit integer, so that an array of unsigned 12 bit integers is provided.

The data part of the symbol file contains, in sequence, the bit data for each of the bitstreams.
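By way of illustration only, the header layout described above may be sketched in Python as follows. The helper names read_lsbf and parse_header are hypothetical and form no part of the symbol format, and the bits are held as a string of “0” and “1” characters purely for readability. The 40 header bits used are those at the start of the worked example given later in this specification.

```python
def read_lsbf(bits, pos, width):
    """Read an unsigned LSBF integer: the first bit read is bit 0."""
    return sum(1 << i for i in range(width) if bits[pos + i] == "1")


def parse_header(bits):
    """Return (list of bitstream lengths in bits, header length in bits)."""
    count = read_lsbf(bits, 0, 4)  # unsigned 4 bit number of bitstreams
    lengths = [read_lsbf(bits, 4 + 12 * i, 12) for i in range(count)]
    return lengths, 4 + 12 * count


# Header of the worked example: a count field followed by three length fields.
header_bits = "1100" + "111001110000" + "011001110000" + "001111110000"
lengths, header_len = parse_header(header_bits)
print(lengths, header_len)  # [231, 230, 252] 40
```

Since each length is known before any bit data is read, the start of every bitstream can be found by summing the preceding lengths and the header length.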

The pixel coordinate convention used in the example is that the coordinate origin (0, 0) is the top left of a bounding box around the symbol, with increasing X going to the right and increasing Y going downwards. The symbol is considered as drawn in ink on a background of unspecified colour. For example, the background could be white or transparent. The encoding only specifies where the ink of the symbol occurs. The rectangular width and height of the bounding box are each never greater than 255.

A bitstream holds a compressed representation of one or more strokes. Each stroke within the bitstream can be separated by sequential decoding of the bitstream.

The separation of the data into bitstreams matches the stroke-based structure of the symbol. There are at least the same number of bitstreams as the maximum number of strokes that ever occurs across the symbol. Therefore, each bitstream is intended to be directed at a ‘stroke-primitive’ renderer that can function independently and concurrently with the render operations caused by the other bitstreams (since the strokes preferably never overlap). The file structure allows concurrent access to multiple bitstreams within the file through the bitstream length information.

The bitstream bit data is organised as a stream of short control codes. Each control code is defined to be followed by zero or more parameter codes, the number of parameters and their sizes being specified for each control code by the symbol format. Variable length integer encoding contributes significantly to good compression.

True entropy coding of the control codes is not used in the example given, but could probably extract further compression, although the varying length of the various control codings is a form of entropy code and was chosen from actual symbol statistics. The parameter ranges currently overlap slightly for different control code representations of the same data. If desired, such overlap could be removed so as to improve further the compression efficiency.

Strokes are encoded as being drawn downwards in the coordinate convention (i.e. increasing Y). For control codes that affect only one row the Y coordinate is assumed to always increase by 1 after performing the action of that control code. There is also a set of repeating control codes that generate more than one row and therefore give good compression. The repeating codes cater for true repeats as well as for runs of very small (‘micro’) changes which are encoded in an efficient 2-bit code. In both cases the multiple rows are assumed to start at the current Y coordinate and extend downwards with increasing Y.

In the example, the bitstream control codes are as follows:

STROKE_START

This is a 5 bit unsigned integer of value 31 (11111) and it is always followed by three parameters X, Y, W, where: X is an unsigned 8 bit integer (i.e. 0 to 255); Y is an unsigned 8 bit integer; and W is an unsigned 8 bit integer, to give a line code of the form (STROKE_START, X, Y, W), or (31, X, Y, W). The parameters X, Y are the coordinates of the left edge of a horizontal ink run of length W which extends to the right of this coordinate. In the example, a stream of bitstream data must always start with a STROKE_START code. A new stroke can be started within a bitstream by using another STROKE_START code.

BIG_CHANGE

This is a 5 bit unsigned integer of value 30 (01111) and is always followed by two parameters ΔX, ΔW, where: ΔX is a signed 7 bit integer (i.e. −64 to +63); and ΔW is a signed 7 bit integer, to give a line code of the form (BIG_CHANGE, ΔX, ΔW) or (30, ΔX, ΔW). The parameters ΔX and ΔW modify the current X coordinate and length W of the run such that: X→X+ΔX; and W→W+ΔW.

MEDIUM_CHANGE

This is a 5 bit unsigned integer of value 29 (10111) and is always followed by two parameters ΔX, ΔW, where: ΔX is a signed 4 bit integer (i.e. −8 to +7); and ΔW is a signed 4 bit integer, to give a line code of the form (MEDIUM_CHANGE, ΔX, ΔW) or (29, ΔX, ΔW). The parameters ΔX and ΔW modify the current X coordinate and length W of the run such that: X→X+ΔX; and W→W+ΔW.
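The signed parameters of the BIG_CHANGE and MEDIUM_CHANGE codes may be read as sketched below. This is a minimal sketch which assumes a two's-complement representation stored LSBF, so that the last bit of the field is the sign bit. That representation is not stated explicitly in this specification, but it reproduces the parameter values shown for the worked example in Table 4. The function name read_signed_lsbf is hypothetical.

```python
def read_signed_lsbf(field):
    """Decode an LSBF field as a two's-complement integer (assumed format)."""
    width = len(field)
    value = sum(1 << i for i, bit in enumerate(field) if bit == "1")
    # Values with the top (last-stored) bit set represent negative numbers.
    return value - (1 << width) if value >= 1 << (width - 1) else value


# MEDIUM_CHANGE parameters from the first bitstream of the worked example.
print(read_signed_lsbf("1011"), read_signed_lsbf("1100"))        # -3 3
# BIG_CHANGE parameters from the first bitstream of the worked example.
print(read_signed_lsbf("1000000"), read_signed_lsbf("1111101"))  # 1 -33
```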

REPEAT

This is a 5 bit unsigned integer of value 28 (00111) and is always followed by one parameter REPEAT_COUNT which is an unsigned 8 bit integer to give a line code of the form (REPEAT, REPEAT_COUNT) or (28, REPEAT_COUNT). The parameter REPEAT_COUNT is the number of times to repeat the current run definition but with Y increasing at each repeat.

REPEAT_SMALL

This is a 5 bit unsigned integer of value 27 (11011) and is always followed by one parameter REPEAT_COUNT which is an unsigned 4 bit integer to give a line code of the form (REPEAT_SMALL, REPEAT_COUNT) or (27, REPEAT_COUNT). The parameter REPEAT_COUNT is the number of times to repeat the current run definition but with Y increasing at each repeat.

MICRO_RUN_CHANGE

This is a 5 bit unsigned integer of value 25 (10011) and is always followed by the parameters MICRO_RUN_TYPE and MICRO_RUN_CHANGES and then a number of parameters RUN_CHANGES equal in number to the value of the parameter MICRO_RUN_CHANGES, where: MICRO_RUN_TYPE is an unsigned 1 bit integer (i.e. 0=left or 1=right); MICRO_RUN_CHANGES is an unsigned 8 bit integer; and RUN_CHANGES are each an unsigned 2 bit integer. There are MICRO_RUN_CHANGES run-change elements coding differential very small RUN_CHANGES for a set of consecutive rows. The run-change codes are different for the left and right cases, as follows:

TABLE 1

MICRO_RUN_TYPE          RUN_CHANGES   RUN_CHANGES
(0 = left; 1 = right)   (decimal)     (LSBF binary)    ΔX    ΔW
0                       0             00                0     0
0                       1             10               −1     1
0                       2             01               −1     0
0                       3             11                0    −1
1                       0             00                0     0
1                       1             10                0     1
1                       2             01                1     0
1                       3             11                1    −1
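The use of Table 1 can be illustrated by the following sketch, in which the table is held as a nested tuple indexed by MICRO_RUN_TYPE and then by the RUN_CHANGES value. The names MICRO_DELTAS and apply_micro_run are hypothetical. The input values are taken from the end of the first bitstream of the worked example in Table 4, where the row preceding the run has X=97 and W=8.

```python
# (dX, dW) indexed by [MICRO_RUN_TYPE][RUN_CHANGES], transcribed from Table 1.
MICRO_DELTAS = (
    ((0, 0), (-1, 1), (-1, 0), (0, -1)),  # type 0 = left
    ((0, 0), (0, 1), (1, 0), (1, -1)),    # type 1 = right
)


def apply_micro_run(x, w, run_type, changes):
    """Return the (X, W) of each consecutive row generated by a micro run."""
    rows = []
    for code in changes:
        dx, dw = MICRO_DELTAS[run_type][code]
        x, w = x + dx, w + dw
        rows.append((x, w))
    return rows


print(apply_micro_run(97, 8, 0, [3, 2, 3, 1, 2, 3]))
# [(97, 7), (96, 7), (96, 6), (95, 7), (94, 7), (94, 6)]
```

Each row in the run costs only two bits, which is why the code suits gently sloping or curving stroke edges.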

LITTLE_CHANGE

This is a 5 bit unsigned integer of value 0 to 24 (00000 to 00011) without any additional parameters and is used to encode small values for ΔX and ΔW using: ΔX=(LITTLE_CHANGE/5)−2; and ΔW=(LITTLE_CHANGE % 5)−2, where / is integer division (any remainder being discarded) and % is the modulus (remainder) operator. Hence ΔX and ΔW can each vary from −2 to +2 as follows:

TABLE 2

LITTLE_CHANGE    ΔX    ΔW
 0               −2    −2
 1               −2    −1
 2               −2     0
 3               −2     1
 4               −2     2
 5               −1    −2
 6               −1    −1
 7               −1     0
 8               −1     1
 9               −1     2
10                0    −2
11                0    −1
12                0     0
13                0     1
14                0     2
15                1    −2
16                1    −1
17                1     0
18                1     1
19                1     2
20                2    −2
21                2    −1
22                2     0
23                2     1
24                2     2
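The mapping of Table 2 follows directly from the two formulas given for LITTLE_CHANGE, and may be sketched in both directions as follows. The function names are hypothetical, and the inverse mapping is simply the algebraic rearrangement of those formulas rather than anything stated separately in this specification.

```python
def little_change_deltas(code):
    """Decode a LITTLE_CHANGE control code (0 to 24) into (dX, dW)."""
    if not 0 <= code <= 24:
        raise ValueError("LITTLE_CHANGE codes lie in the range 0 to 24")
    quotient, remainder = divmod(code, 5)  # integer division and modulus
    return quotient - 2, remainder - 2


def little_change_code(dx, dw):
    """Encode (dX, dW), each in the range -2 to +2, as a LITTLE_CHANGE code."""
    if not (-2 <= dx <= 2 and -2 <= dw <= 2):
        raise ValueError("each change must lie in the range -2 to +2")
    return (dx + 2) * 5 + (dw + 2)


print(little_change_deltas(4))    # (-2, 2)
print(little_change_code(-1, 2))  # 9
```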

In summary, therefore, the line codes used in the example are as follows:

TABLE 3

Code Name          Control Code   Control Code     Primary Parameters                    Secondary Parameters
                   (decimal)      (LSBF binary)
STROKE_START       31             11111            X₈, Y₈, W₈                            None
BIG_CHANGE         30             01111            ΔX₇, ΔW₇                              None
MEDIUM_CHANGE      29             10111            ΔX₄, ΔW₄                              None
REPEAT             28             00111            REPEAT_COUNT₈                         None
REPEAT_SMALL       27             11011            REPEAT_COUNT₄                         None
MICRO_RUN_CHANGE   25             10011            MICRO_RUN_TYPE₁, MICRO_RUN_CHANGES₈   For 1 to MICRO_RUN_CHANGES: RUN_CHANGES₂
LITTLE_CHANGE      0 to 24        00000 to 00011   None                                  None

In Table 3, the subscripts denote the number of bits of the respective parameters.

A specific example of decoding a symbol file will now be described with reference to Table 4, in which the following stream of 753 bits of binary data is divided up, analysed, and used to render the symbol shown in FIGS. 1 to 4:

11001110 01110000 01100111 00000011 11110000 11111111 01010001 1110001100000101 11101111 00001000 01001011 11011110 00010000 10010111 1011110000100001 00101111 01111000 01000010 01011110 11110000 01000110 0111110000001111 10110111 00000011 01111110 10100001 00011100 01100100 1100110000011011 11001111 11111110 11100011 11000001 00001011 11011011 0101110111001010 11101111 10010010 10010101 11011111 00100100 00101001 0001101001010011 00110000 01010100 01010011 11101100 01100111 10111010 0101100101000011 11101111 00010111 11010111 01001011 10111010 00011111 1111000010001100 10010110 00001010 00010000 10011000 00111011 00100000 1101110000011010 01101101 00000011 11001100 11001100 11001110 01111001 0110110110010100 10001101 01100001 01011010 01010010 10111011 10010101 1101110010011110 01111100 11010111 10011000 11011110 0

TABLE 4 Code No. of Bitstreams LSBF Binary Bits (decimal) HEADER 1100  4 3 Bitstream Length (decimal) 111001110000 12 231 011001110000 12 230001111110000 12 252 Header length 40 Code Line Code LSBF Binary Bits(decimal) Y ΔX ΔW X⁻ W X⁺ BITSTREAM 1 11111 11101010 00111100  29 31 8760 6 60 87  6  92 01100000 10111 1011 1100  13 29 −3 3 61 −3    3 84  9 92 00100  5 4 62 −2    2 82  11  92 00100  5 4 63 −2    2 80  13  9210111 1011 1100  13 29 −3 3 64 −3    3 77  16  92 00100  5 4 65 −2    275  18  92 00100  5 4 66 −2    2 73  20  92 10111 1011 1100  13 29 −3 367 −3    3 70  23  92 00100  5 4 68 −2    2 68  25  92 00100  5 4 69 −2   2 66  27  92 10111 1011 1100  13 29 −3 3 70 −3    3 63  30  92 00100 5 4 71 −2    2 61  32  92 00100  5 4 72 −2    2 59  34  92 10111 10111100  13 29 −3 3 73 −3    3 56  37  92 00100  5 8 74 −1    1 55  38  9200100  5 12 75   0    0 55  38  92 01111 1000000 1111101  19 30 1 −33 76  1 −33 56  5  60 10111 0000 0011  13 29 0 −4 77   0  −4 56  1  56 011111101010 0001000  19 30 43 8 78 43    8 99  9 107 11100  5 7 79 −1    098  9 106 01100  5 6 80 −1  −1 97  8 104 10011 0 01100000 11 01 11  2625 0 6 3 2 3 1 81   0  −1 97  7 103 10 01 11 2 3 82 −1    0 96  7 102 83  0  −1 96  6 101 84 −1    1 95  7 101 85 −1    0 94  7 100 86   0  −194  6  99 Bitstream length 231 Line length 449 BITSTREAM 2 1111111101110 00111100  29 31 119 60 8  60 119  8 126 00010000 10111 10110110  13 29 −3 6  61 −3    6 116  14 129 10111 0111 0010  13 29 −2 4  62−2    4 114  18 131 10111 0111 1100  13 29 −2 3  63 −2    3 112  21 13210010  5 9  64 −1    2 111  23 133 10010  5 9  65 −1    2 110  25 13410111 0111 1100  13 29 −2 3  66 −2    3 108  28 135 10010  5 9  67 −1   2 107  30 136 00010  5 8  68 −1    1 106  31 136 10010  5 9  69 −1   2 105  33 137 00010  5 12  70   0    0 105  33 137 10010  5 9  71 −1   2 104  35 138 10011 0 01100000 10 10 10  26 25 0 6 1 1 1 0  72 −1   1 103  36 138 00 10 10 1 1  73 −1    1 102  37 138  74 −1    1 101 38 138  75   0    0 101  38 138  76 
−1    1 100  39 138  77 −1    1  99 40 138 01111 1011000 1100111  19 30 13 −13  78 13 −13 112  27 138 101110100 1011  13 29 −2 3  79   2  −3 114  24 137 00101  5 20  80   2  −2116  22 137 00001  5 16  81   1  −1 117  21 137 11110  5 15  82   1  −2118  19 136 11110  5 15  83   1  −2 119  17 135 00101  5 20  84   2  −2121  15 135 11110  5 15  85   1  −2 122  13 134 10111 0100 1011  13 29 2−3  86   2  −3 124  10 133 10111 0100 1011  13 29 2 −4  87   2  −4 126 6 131 Bitstream length 230 Line length 701 BITSTREAM 3 11111 1100001000110010  29 31 67 76 26  76 67  26  92 01011000 00101  5 20  77   2 −269  24  92 00001  5 16  78   1 −1 70  23  92 00001  5 16  79   1 −1 71 22  92 00110  5 12  80   0   0 71  22  92 00001  5 16  81   1 −1 72  21 92 11011 0010  9 27 4  82   0   0 72  21  92  83   0   0 72  21  92  84  0   0 72  21  92  85   0   0 72  21  92 00001  5 16  86   1 −1 73  20 92 10111 0000 0110  13 29 0 6  87   0   6 73  26  98 10011 0 1101000000 11 11  36 25 0 1 1 0 3 3 0  88   0   0 73  26  98 00 11 00 11 00 1100 11 3 0 3 0 3 0 3  89   0 −1 73  25  97  90   0 −1 73  24  96  91   0  0 73  24  96  92   0 −1 73  23  95  93   0   0 73  23  95  94   0 −173  22  94  95   0 −0 73  22  94  96   0 −1 73  21  93  97   0   0 73 21  93  98   0 −1 73  20  92 00111 00111100  13 28 60  99   0   0 73 20  92 100   0   0 73  20  92 101   0   0 73  20  92 102   0   0 73  20 92 103   0   0 73  20  92 104   0   0 73  20  92 105   0   0 73  20  92106   0   0 73  20  92 107   0   0 73  20  92 108   0   0 73  20  92 109  0   0 73  20  92 110   0   0 73  20  92 111   0   0 73  20  92 112   0  0 73  20  92 113   0   0 73  20  92 114   0   0 73  20  92 115   0   073  20  92 116   0   0 73  20  92 117   0   0 73  20  92 118   0   0 73 20  92 119   0   0 73  20  92 120   0   0 73  20  92 121   0   0 73  20 92 122   0   0 73  20  92 123   0   0 73  20  92 124   0   0 73  20  92125   0   0 73  20  92 126   0   0 73  20  92 127   0   0 73  20  92 128  0   0 73  20  92 129   0   0 73  20  92 130   
0   0 73  20  92 131   0  0 73  20  92 132   0   0 73  20  92 133   0   0 73  20  92 134   0   073  20  92 135   0   0 73  20  92 136   0   0 73  20  92 137   0   0 73 20  92 138   0   0 73  20  92 139   0   0 73  20  92 140   0   0 73  20 92 141   0   0 73  20  92 142   0   0 73  20  92 143   0   0 73  20  92144   0   0 73  20  92 145   0   0 73  20  92 146   0   0 73  20  92 147  0   0 73  20  92 148   0   0 73  20  92 149   0   0 73  20  92 150   0  0 73  20  92 151   0   0 73  20  92 152   0   0 73  20  92 153   0   073  20  92 154   0   0 73  20  92 155   0   0 73  20  92 156   0   0 73 20  92 157   0   0 73  20  92 158   0   0 73  20  92 10110  5 13 159  0   1 73  21  93 11011 0010  9 27 4 160   0   0 73  21  93 161   0   073  21  93 162   0   0 73  21  93 163   0   0 73  21  93 10010  5 9 164−1   2 72  23  94 00110  5 12 165   0   0 72  23  94 10110  5 13 166   0  1 72  23  95 00010  5 8 167 −1   1 71  25  95 10110  5 13 168   0   171  26  96 10010  5 9 169 −1   2 70  28  97 10010  5 9 170 −1   2 69  30 98 10111 0111 0010  13 29 −2 4 171 −2   4 67  34 100 10111 0111 0010 13 29 −2 4 172 −2   4 65  38 102 01111 0011111 0001000  19 30 −4 8 173−4   8 61  46 106 01111 0101111 0011000  19 30 −6 12 174 −6 12 55  58112 11011 1100  9 27 3 175   0   0 55  58 112 176   0   0 55  58 112 177  0   0 55  58 112 Bitstream length 252 Line length 2353 COMPLETE SYMBOLTotal code length 753 Total line length 3503

As shown in the above table, the first four bits are taken to be the number of bitstreams in the file, and in the example have a value of three. Therefore: the next twelve bits (value 231) are taken to be the length of the first bitstream; the next twelve bits (value 230), the length of the second bitstream; and the next twelve bits (value 252), the length of the third bitstream. It is thus possible now to locate the start of each bitstream in the file and process the three bitstreams in parallel if desired.

Considering now the first bitstream, the first five bits are taken to be a control code, and the control code has a value of 31. Referring to Table 3, this denotes STROKE_START, and the next 8, 8 and 8 bits (values 87, 60, 6) are thus taken to be the parameters (X, Y, W) of the STROKE_START code. A line is therefore rendered, as shown by the uppermost line in FIG. 1, having a Y value of 60, an X⁻(60) starting value of 87, a length W(60) of 6, and thus an X⁺(60) ending value of 92 (=X⁻+W−1).

The next five bits are taken to be a control code, and the control code has a value of 29. Referring to Table 3, this denotes MEDIUM_CHANGE, and the next 4 and 4 bits (values −3, 3) are thus taken to be the parameters (ΔX, ΔW) of the MEDIUM_CHANGE code. A line is therefore rendered, as shown by the next line in FIG. 1, having a Y value of 61 (i.e. one greater than the previous line), an X⁻(61) starting value of 84 (=X⁻(Y−1)+ΔX), a length W(61) of 9 (=W(Y−1)+ΔW), and thus an X⁺(61) ending value of 92 (=X⁻+W−1).

The remainder of the first bitstream is analysed and lines are rendered in a similar fashion until the end of the bitstream is reached, thus producing a set of rendered lines as shown in FIG. 1. It should be noted that the last control code of the first bitstream (for Y=81) has a value of 25, denoting MICRO_RUN_CHANGE. Therefore, referring to Table 3, the next bit (value 0, as shown in Table 4) is taken to be the MICRO_RUN_TYPE, the next eight bits (value 6) are taken to be the number of RUN_CHANGES, and the next six pairs of bits (values 3, 2, 3, 1, 2, 3) are taken to be the coded values of those six RUN_CHANGES for the lines with Y values of 81, 82, 83, 84, 85 and 86.

In a similar fashion, the second and third bitstreams are decoded in parallel with the first bitstream, or one after another, to produce the sets of rendered lines shown in FIGS. 2 and 3 respectively. It should be noted that the control code for the line 99 of the third bitstream has a value of 28 denoting REPEAT. Therefore, the next eight bits (value 60) are taken to be the REPEAT_COUNT, i.e. the number of times that the previous line (Y=98) is repeated. Accordingly, the values X⁻, W, X⁺ for each of lines 99 to 158 are the same as those for line 98. It will therefore be appreciated that this thirteen bit line code produces 1200 bits of rendered stroke.

It will be appreciated that the lines shown in FIGS. 1 to 3 are rendered in the same memory with the same co-ordinate origin, and therefore in combination the three bitstreams produce a complete symbol as shown in FIG. 4.
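The decoding procedure walked through above may be sketched as a single software loop, as below. This is an illustrative sketch only and not the decompressor of the embodiment: the names BitReader and decode_bitstream are hypothetical, the bits are held as a text string, signed parameters are assumed to be two's complement, and each rendered line is simply returned as a (Y, X, W) tuple rather than written to memory. The test input is the first four line codes of the first bitstream of Table 4.

```python
# (dX, dW) indexed by [MICRO_RUN_TYPE][RUN_CHANGES], transcribed from Table 1.
MICRO_DELTAS = (
    ((0, 0), (-1, 1), (-1, 0), (0, -1)),
    ((0, 0), (0, 1), (1, 0), (1, -1)),
)


class BitReader:
    """Sequential reader of LSBF fields from a string of '0'/'1' characters."""

    def __init__(self, bits):
        self.bits = bits.replace(" ", "")
        self.pos = 0

    def more(self):
        return self.pos < len(self.bits)

    def unsigned(self, width):
        field = self.bits[self.pos:self.pos + width]
        self.pos += width
        return sum(1 << i for i, bit in enumerate(field) if bit == "1")

    def signed(self, width):
        value = self.unsigned(width)  # assumed two's complement
        return value - (1 << width) if value >= 1 << (width - 1) else value


def decode_bitstream(bits):
    """Decode one bitstream into a list of (Y, X, W) horizontal ink runs."""
    reader, rows = BitReader(bits), []
    x = y = w = 0
    while reader.more():
        code = reader.unsigned(5)
        if code == 31:    # STROKE_START: absolute X, Y, W
            x, y, w = reader.unsigned(8), reader.unsigned(8), reader.unsigned(8)
            deltas = [(0, 0)]
        elif code == 30:  # BIG_CHANGE
            deltas = [(reader.signed(7), reader.signed(7))]
        elif code == 29:  # MEDIUM_CHANGE
            deltas = [(reader.signed(4), reader.signed(4))]
        elif code in (28, 27):  # REPEAT, REPEAT_SMALL
            deltas = [(0, 0)] * reader.unsigned(8 if code == 28 else 4)
        elif code == 25:  # MICRO_RUN_CHANGE
            run_type, count = reader.unsigned(1), reader.unsigned(8)
            deltas = [MICRO_DELTAS[run_type][reader.unsigned(2)] for _ in range(count)]
        elif code <= 24:  # LITTLE_CHANGE
            deltas = [(code // 5 - 2, code % 5 - 2)]
        else:
            raise ValueError(f"control code {code} is not defined")
        for dx, dw in deltas:  # one rendered row per entry, Y increasing
            x, w = x + dx, w + dw
            rows.append((y, x, w))
            y += 1
    return rows


print(decode_bitstream("11111 11101010 00111100 01100000 10111 1011 1100 00100 00100"))
# [(60, 87, 6), (61, 84, 9), (62, 82, 11), (63, 80, 13)]
```

Because every control code reduces to a list of (ΔX, ΔW) pairs, the rendering step is identical for all codes, which is consistent with the aim of decoding on simple hardware.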

It will be noted from the foot of Table 4 that the total length of the code (including the header) to produce the lower case Times Roman “r” is 753 bits. By comparison, if the symbol were presented as a raw bitmap, the length of the bitmap would be 65536 (=256²) bits, and thus the example of the invention produces a lossless compression ratio of over 87:1 compared with a raw bitmap. The upright rectangular area bounding the example symbol and denoted by dashed lines in FIG. 4 has an area of (X⁺_max − X⁻_min + 1) × (Y_max − Y_min + 1) = (138 − 55 + 1) × (177 − 60 + 1) = 9912 bits. Therefore, if the symbol were presented as a partial bitmap, together with its origin (8+8 bits) and its width (8 bits), the length of the bitmap would be 9936 bits. Accordingly, even by comparison with such a partial bitmap, the example of the present invention produces a substantial lossless compression ratio of over 13:1.
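The arithmetic behind these two compression ratios can be checked with the short sketch below, which uses only the figures quoted above and in Table 4. The variable names are illustrative.

```python
# Compressed size: 40 bit header plus the three bitstream lengths.
code_bits = 40 + 231 + 230 + 252

# Raw bitmap covering the full 256 x 256 coordinate space, one bit per pixel.
raw_bits = 256 ** 2

# Partial bitmap: bounding rectangle plus origin (8 + 8 bits) and width (8 bits).
box_bits = (138 - 55 + 1) * (177 - 60 + 1)
partial_bits = box_bits + 8 + 8 + 8

print(code_bits, box_bits, partial_bits)  # 753 9912 9936
print(round(raw_bits / code_bits, 2))     # 87.03
print(round(partial_bits / code_bits, 2)) # 13.2
```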

Having described a method of decoding the encoded symbol file, the method of encoding a bitmap to produce such a symbol file is essentially the reverse of the decoding method, but in addition involves the steps of: (a) determining how the symbol is to be divided up (if at all) into a plurality of strokes and (if at all) into a plurality of bitstreams; and (b) determining which control code to use when more than one can be used to encode a line.

With regard to step “a” (splitting the symbol), there is no unique way of splitting up the symbol, and optimum performance does depend on the way in which the decoder can decode the symbol file. For example, a single bitstream may be used, with more than one STROKE_START control code being used as necessary to cope with the symbol including more than one line having the same Y value. In the case where the decoder can decode only one bitstream at a time, this may provide optimum performance. An advantage of dividing the data up into a plurality of bitstreams, each with its own entry in the header, is that a decoder which is so capable can decode the bitstreams in parallel. Some symbols, such as a Times Roman “I”, do not need to be split into strokes. If, however, such a symbol is divided in two, say the top and bottom halves, with respective bitstreams, the total code length will be slightly greater due to the longer header, the additional STROKE_START control code, and the need to re-establish the repeat for the lower half of the main stem of the symbol. Therefore, the performance of a single channel decoder will be slightly reduced. However, the performance of a dual channel parallel decoder will be almost doubled, although this is dependent on having a suitable decoder architecture to exploit the potential performance increase.

Preferably, when splitting up the symbol, the resultant strokes do not overlap.

With regard to step “b” (choosing the codes), various algorithms may be employed so as to achieve a high compression ratio. For example, for each line, the highest ranking of the following codes may be chosen if it fulfills the stated condition:

7. REPEAT if it can be used for 32 or more lines;

6. REPEAT_SMALL if it can be used;

5. MICRO_RUN_CHANGE if it can be used for five or more lines;

4. LITTLE_CHANGE if it can be used;

3. MEDIUM_CHANGE if it can be used;

2. BIG_CHANGE if it can be used;

1. STROKE_START.
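The lower part of this ranking (codes 4 to 1), which depends only on the change between one line and the next, may be sketched as follows. The function name choose_change_code is hypothetical. The parameter ranges assume two's-complement signed fields, and the bit costs are the 5 bit control code plus the parameter widths given in Table 3. The run-based codes (REPEAT, REPEAT_SMALL and MICRO_RUN_CHANGE) require look-ahead over several lines and are deliberately omitted from this sketch.

```python
def choose_change_code(dx, dw):
    """Pick the cheapest single-line code for a change (dX, dW).

    Returns (code name, total bits). Run-based codes are not considered.
    """
    if -2 <= dx <= 2 and -2 <= dw <= 2:
        return "LITTLE_CHANGE", 5           # parameterless control code
    if -8 <= dx <= 7 and -8 <= dw <= 7:     # assumed signed 4 bit range
        return "MEDIUM_CHANGE", 5 + 4 + 4
    if -64 <= dx <= 63 and -64 <= dw <= 63:  # assumed signed 7 bit range
        return "BIG_CHANGE", 5 + 7 + 7
    return "STROKE_START", 5 + 8 + 8 + 8    # fall back to absolute values


# Changes taken from the first bitstream of the worked example.
for change in [(-2, 2), (-3, 3), (1, -33), (43, 8)]:
    print(change, choose_change_code(*change))
```

For the four changes shown, the codes chosen agree with those used in Table 4 (LITTLE_CHANGE, MEDIUM_CHANGE, BIG_CHANGE and BIG_CHANGE respectively).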

It will be appreciated that many modifications and developments may bemade to the example described above. For instance, some of the linecodes mentioned above directly encode the change ΔW in the length of theline, or the length W of the line itself. Instead, these line codes maydirectly encode the change ΔX⁺ in the position of the right-hand end ofthe line, or the position X⁺ itself.

Also, in the example mentioned above, each line extends in the Xdirection. For some symbol sets, it may be that better compression canbe achieved by encoding lines which extend in the Y direction.Furthermore, different symbols, or different strokes in the same symbol,may be encoded with lines extending in the different directions, and,for example, one or more extra bits may be included in the header todefine the direction of the lines for the symbol or for each bitstream.

Furthermore, in the above example, each bitstream begins with a STROKE_START control code. This control code is therefore redundant, and accordingly it may be inferred, with only the values of the parameters of the initial STROKE_START control code being specified.

In the basic embodiment of the compression method as described, a representation of the symbol for a character of a fixed point size is held. Ideally this representation is of such a size (in terms of represented pixels) that it captures the detailed structure of each character representation without the storage of any extra unnecessary information. However, many applications of character compression will need a range of font sizes to be represented.

There are several ways to generate a range of point sizes from the stored compressed representation. The first and simplest method is to hold a number of compressed font descriptions at different sizes and render the character from the appropriate one depending on the font size required. This is inefficient in terms of memory usage, but it is simple and hence particularly applicable where only a limited number of point sizes are needed. This approach also has the advantage that it allows for change of the character shape with point size, which may be desirable for reasons relating to typography.

A second technique for generating a range of character sizes is to scale each character as it is decompressed, either enlarging or reducing as required. The specific details of such text scaling will generally depend upon many characteristics of the application, such as the resolution of the resulting text in terms of number of pixels and the point sizes that are required to be generated. These affect the choice of size for the base character representation and also affect the type of scaling scheme used.

For example, a character may be represented on a 600 dpi grid. One ‘point’ (a printing unit for text size) is approximately a 72nd of an inch and in this case corresponds to 8.33 pixels. Therefore a text point size of 48 in this case corresponds to 400 pixels, and this is the approximate vertical extent of the tallest character. This size of character can be used as the compressed representation and then scaled by simple integer division on decompression to yield the following point sizes:

integer divisor    scaled point size
1                  48
2                  24
3                  16
4                  12
5                  10

Only the one 48 point character is actually stored in compressed form, and the other sizes are generated dynamically from this compressed version as required.
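The arithmetic behind these figures can be checked with a short calculation. The constant and function names below are chosen for illustration only. Note that dividing 48 points by 5 gives 9.6 points, which the table above lists as a nominal 10.

```python
DPI = 600               # grid resolution used in the example
POINTS_PER_INCH = 72    # one point is approximately 1/72 inch
BASE_POINT_SIZE = 48    # size of the stored compressed character

# Pixels per point on this grid: 600 / 72, about 8.33.
pixels_per_point = DPI / POINTS_PER_INCH

# Height of the stored character in pixels: 48 * 600 / 72 = 400.
base_height_px = BASE_POINT_SIZE * DPI // POINTS_PER_INCH


def scaled_sizes(divisors=(1, 2, 3, 4, 5)):
    """Return (divisor, point size, pixel height) for each integer divisor."""
    return [(d, BASE_POINT_SIZE / d, base_height_px // d) for d in divisors]


if __name__ == "__main__":
    print(f"{pixels_per_point:.2f} pixels per point, "
          f"{base_height_px} pixels at {BASE_POINT_SIZE} point")
    for divisor, points, pixels in scaled_sizes():
        print(divisor, points, pixels)
```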

Scaling by integer factors can simplify the rendering of the scaled and decompressed characters. For example, reducing the character size can be done by counting decompressed pixels and outputting a rendered pixel when the counter reaches the divider, then resetting the counter and repeating. Increasing the character size can be done by replication of rendered pixels.
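A minimal one-dimensional sketch of this counter scheme follows. It assumes that a rendered pixel is inked if any decompressed pixel in its group is inked; that rule is a choice made for the example, and the function names are hypothetical. Any incomplete group at the end of the row is simply dropped here.

```python
def reduce_row(pixels, divider):
    """Shrink a row of 0/1 pixels by an integer divider using a counter."""
    rendered = []
    counter = 0
    ink_seen = 0
    for pixel in pixels:
        ink_seen |= pixel          # remember ink within the current group
        counter += 1
        if counter == divider:     # counter has reached the divider
            rendered.append(ink_seen)
            counter = 0            # reset and repeat
            ink_seen = 0
    return rendered


def enlarge_row(pixels, factor):
    """Enlarge a row by replicating each pixel 'factor' times."""
    return [pixel for pixel in pixels for _ in range(factor)]


if __name__ == "__main__":
    print(reduce_row([1, 1, 0, 0, 0, 1], 2))
    print(enlarge_row([1, 0], 3))
```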

To implement integer scaling in this way some buffering will be required in the rendering hardware to hold a line of pixels, so as to allow the scaling to occur in both dimensions. The rendered pixel value depends on how many decompressed pixels with ink occur within each cycle of the counters (considered in both dimensions, this corresponds to a rectangular patch of decompressed pixels). It is also possible to include greyscale capability if needed by counting how many decompressed pixels with ink occur in each rendered pixel and using this to set a grey level.
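The two-dimensional case with a line buffer might be sketched as follows. The ink count per patch is returned directly as a grey level in the range 0 to divider squared. The bilevel threshold of half the patch area is an assumption made for the example, as are the function names.

```python
def reduce_with_grey(bitmap, divider):
    """Reduce a bitmap (list of rows of 0/1) by an integer divider.

    Each output value is the number of inked pixels in a
    divider x divider patch of the decompressed bitmap.
    """
    out_rows = len(bitmap) // divider
    out_cols = len(bitmap[0]) // divider if bitmap else 0
    result = []
    for r in range(out_rows):
        line_buffer = [0] * out_cols       # holds one line of ink counts
        for dy in range(divider):          # accumulate 'divider' source rows
            src = bitmap[r * divider + dy]
            for c in range(out_cols):
                line_buffer[c] += sum(src[c * divider:(c + 1) * divider])
        result.append(line_buffer)
    return result


def to_bilevel(counts, divider):
    """Threshold the ink counts at half the patch area (assumed rule)."""
    threshold = (divider * divider + 1) // 2
    return [[1 if n >= threshold else 0 for n in row] for row in counts]


if __name__ == "__main__":
    glyph = [
        [1, 1, 0, 0],
        [1, 0, 0, 0],
        [0, 0, 1, 1],
        [0, 0, 1, 1],
    ]
    grey = reduce_with_grey(glyph, 2)
    print(grey)
    print(to_bilevel(grey, 2))
```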

Other efficient and more flexible scaling schemes based on digital-differential-analyzers (DDA) can be used, or other techniques as used for efficient bitmap scaling can be employed.
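The text does not specify a particular DDA scheme. As one illustration only, a row can be resampled to an arbitrary length with an integer error accumulator, which is a common DDA-style approach. The sketch below performs nearest-neighbour sampling and is not taken from the described embodiment.

```python
def dda_scale_row(pixels, out_len):
    """Resample a row to out_len pixels using an integer accumulator.

    Output pixel i takes the value of source pixel floor(i * n / out_len),
    where n is the source length, but no multiplication or division is
    needed inside the loop.
    """
    n = len(pixels)
    scaled = []
    src = 0     # current source index
    acc = 0     # error accumulator
    for _ in range(out_len):
        scaled.append(pixels[src])
        acc += n
        while acc >= out_len:      # step the source index as error overflows
            acc -= out_len
            src += 1
    return scaled


if __name__ == "__main__":
    print(dda_scale_row([1, 0, 0, 1], 6))   # enlarge 4 pixels to 6
    print(dda_scale_row([1, 0, 0, 1], 2))   # reduce 4 pixels to 2
```

Unlike the integer divider scheme, this allows non-integer scale factors, at the cost of sampling rather than counting ink within each group.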

An embodiment of an apparatus according to the present invention will now be described with reference to FIGS. 5 and 6. Referring specifically to FIG. 5, a computer such as a PC comprises, in known fashion, a processor 10 which communicates with a memory 12, input/output circuitry 22 and with other devices via an address bus 14, data bus 16 and control bus 18. The input/output circuitry can communicate with external systems via, for example, a telephone line 20. The computer also comprises a decompressor 24 which communicates with the memory 12 via a compressed-font memory interface 26 and a rendered-font memory interface 28. If appropriate, these two memory interfaces may be in common.

Referring now to FIG. 6, the decompressor 24 includes a plurality of stroke engines 30(0)-(2) (three in the embodiment shown) connected via address and data buses and control lines to the compressed-font memory interface 26, the connection of the control lines being via an access arbiter 31. The decompressor also includes a write controller 32 connected via address and data buses and a control line to the rendered-font memory interface 28 and also connected to a row sequencer 34. The row sequencer 34 is connected to the stroke engines 30(0)-(2) via a Y-coordinate bus 36 and also individual control lines 38(0)-(2). The write controller 32 is connected to the stroke engines 30(0)-(2) via an X-coordinate bus 40 and individual ink signal lines 42(0)-(2). The stroke engines 30(0)-(2), access arbiter 31, write controller 32 and row sequencer 34 of the decompressor 24 are implemented using logic arrays to perform in the manner described below.

In the embodiment, the decompressor 24 does not decode the header of a symbol file and only deals with the bitstream data. It is probably better to implement header decoding within the controlling host processor software, as it occurs only once per symbol decompression.

The start addresses for the bitstreams are pre-loaded by the host processor 10 into the stroke engines 30(0)-(2) before decompression commences. The stroke engines 30(0)-(2) then individually pull bitstream data from the compressed font. The engines 30(0)-(2) asynchronously request the bitstream data, and the access arbiter 31 manages the requests so that one engine at a time is able to read, and so that each has equal access to the compressed data. In practice there could be more or fewer stroke engines depending on the particular implementation, performance requirements and font set.

The row sequencer 34 steps down through the symbol rows. For each row the write controller 32 scans across the row in the X coordinate, a pixel at a time, supplying the X coordinate values to the stroke engines 30(0)-(2) via the X-coordinate bus 40. The stroke engines decode the bitstream data which has been received, and each stroke engine 30(0)-(2) sets its respective ‘INK’ output on line 42(0)-(2) high when, for the current Y coordinate, the X coordinate falls within the active ink area dealt with by that stroke engine. If necessary, and depending on the memory word-width, the write-controller 32 gathers pixel data into words and then writes those words out to the memory 12. This can occur as the X coordinate is scanned to minimise counter logic. The start address for rendered data is pre-loaded into the write-controller 32 before decompression of a symbol.
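The scanning scheme just described can be modelled in software along the following lines. This is a behavioural sketch only, not the logic-array implementation. Each stroke engine is represented, as an assumption made for the example, by a mapping from a Y coordinate to the (start X, length) of the line segment that engine contributes on that row; bitstream decoding itself is not modelled. The word width and most-significant-bit-first packing are likewise assumptions.

```python
def render_symbol(strokes, width, height, word_bits=8):
    """Model the row sequencer, X scan and word gathering.

    strokes: list of dicts {y: (x_start, length)}, one per stroke engine
             (a hypothetical already-decoded form of each bitstream).
    Returns the list of memory words written, row by row.
    """
    words = []
    for y in range(height):                 # row sequencer steps down rows
        word, nbits = 0, 0
        for x in range(width):              # write controller scans X
            # INK is high if any engine covers (x, y).
            ink = any(
                y in stroke
                and stroke[y][0] <= x < stroke[y][0] + stroke[y][1]
                for stroke in strokes
            )
            word = (word << 1) | int(ink)   # gather pixels into a word
            nbits += 1
            if nbits == word_bits:
                words.append(word)
                word, nbits = 0, 0
        if nbits:                           # pad a partial word at row end
            words.append(word << (word_bits - nbits))
    return words


if __name__ == "__main__":
    engines = [{0: (0, 2), 1: (1, 2)}, {0: (5, 3)}]
    for w in render_symbol(engines, width=8, height=2):
        print(f"{w:08b}")
```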

A set of control and status registers (hooked into the logic of each block but not shown on the drawing) allow the host processor 10 to set up the decompressor and monitor its activity, detecting also when a stroke or symbol is complete.

Typically the character representation will be decompressed and rendered as a bitmap into an area of memory set aside as a font cache or temporary store. This allows rapid bitmap moves to be used subsequently to generate multiple characters either directly onto a display surface, or into a framestore which is used to hold an image of the page or screen for either display or printer applications. The details of memory usage and whether, or how, fonts are cached will therefore depend on the particular application requirements. Such details can readily be determined by the man skilled in the art to meet the requirements of a particular application.

The design described above contains only a small amount of fast logic circuitry, in the write controller 32 and X-coordinate bus 40. The other parts of the apparatus only update every row. The circuitry to generate each INK output within the stroke engine is quite compact.

In an alternative implementation approach which is not shown in the drawings, a large and fairly complex row-register sets INK bits in parallel for a particular stroke-engine, with each stroke-engine affecting the register in turn. The row is then built up in only N cycles, where N is the number of engines. However, the row must still be written out to memory, which still involves generating a sequence of word-writes. The approach as shown in FIG. 5 probably uses less logic than the register approach but at the expense of a higher clock-rate for a small part of the logic.
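The row-register alternative can be illustrated in software with a bit mask per engine, so that the row is built in one step per engine rather than one step per pixel. This is a sketch under stated assumptions: the function names are hypothetical, bit 0 of the row (X = 0) is taken as the most significant bit, and each segment is assumed to fit within the row width.

```python
def build_row_register(segments, width):
    """Build one row as an integer, one OR operation per engine.

    segments: list of (x_start, length), one entry per stroke engine
              contributing to this row (assumed to fit within 'width').
    """
    row = 0
    for x_start, length in segments:       # one cycle per engine
        if length > 0:
            mask = ((1 << length) - 1) << (width - x_start - length)
            row |= mask                    # set all INK bits in parallel
    return row


def row_to_words(row, width, word_bits=8):
    """Split the row register into the sequence of words to be written."""
    n_words = -(-width // word_bits)       # ceiling division
    row <<= n_words * word_bits - width    # pad the final partial word
    word_mask = (1 << word_bits) - 1
    return [
        (row >> (word_bits * (n_words - 1 - i))) & word_mask
        for i in range(n_words)
    ]


if __name__ == "__main__":
    row = build_row_register([(0, 2), (5, 3)], width=8)
    print(f"{row:08b}")
    print(row_to_words(build_row_register([(1, 2)], width=12), width=12))
```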

Although an apparatus has been described which uses a combination of a decompressor and a conventional PC, it will be appreciated that the apparatus of the invention may alternatively be implemented by software programming of conventional computer hardware.

What is claimed is:
 1. A method of lossless compression of a bitmap of a symbol, comprising: dividing up the symbol into one or more strokes, such that each stroke comprises a plurality of parallel, laterally adjacent, continuous line segments; run-length encoding each stroke to form a stream of line codes for that stroke, each line code representing one of the line segments or a group of the line segments, wherein the stream of line codes provides absolute values for position and length of a first line segment and relative values of position and length for other line segments, and presenting the streams of the line codes for the strokes in sequence as a set representing the symbol; said step of run length encoding further comprising the steps of: encoding each line or group of lines of that stroke by one of a plurality of encoding methods to form a line code, wherein at least one of the encoding methods produces such a line code comprising: a control code indicative of that encoding method; and at least one parameter value for that control code; and presenting the line codes as a stream in the order in which the lines appear in that stroke, the lines each being encoded in dependence upon their position and length relative to the preceding line in that stroke.
 2. A method as claimed in claim 1, further comprising the step of adding a header indicative of the number of streams in the set and the length of each stream.
 3. A method of decompressing a losslessly compressed bitmap of a symbol, wherein the symbol has been divided into a plurality of strokes each of which comprises a plurality of parallel, laterally adjacent, continuous line segments, the compressed bitmap comprising for each of the strokes, a plurality of run-length encoded line codes each for one of the line segments or a group of the line segments in that stroke, each line code representing one of the line segments or a group of the line segments, wherein the stream of line codes provides absolute values for position and length of a first line segment and relative values of position and length for other line segments, the line codes being arranged as a stream in the order in which the line segments appear in that stroke, and the streams of the line codes for the strokes being arranged in sequence as a set representing the symbol, the method comprising the steps of: decoding each line code in each stream to form a corresponding line definition for one or more line segments for each line code; and rendering each stroke by rendering the one or more line segments defined by each line definition, thereby rendering the losslessly compressed bitmap as a symbol, wherein each line or group of lines of a stroke has been encoded by one of a plurality of encoding methods to form a line code; and the line codes presented as a stream in the order in which the lines appear in that stroke, the lines each being encoded in dependence upon their position and length relative to the preceding line in that stroke, at least one of the encoding methods producing such a line code comprising a control code indicative of that encoding method; and at least one parameter value for that control code, wherein in the case of decoding such a parametered line code, the decoding step comprises the steps of determining the control code in the parametered line code, determining each parameter of that line code, and forming the line definition in accordance with the value of the control code and the value of each parameter.
 4. A method of decompressing a losslessly compressed bitmap of a symbol, wherein the symbol has been divided into a plurality of strokes each of which comprises a plurality of parallel, laterally adjacent, continuous line segments, the compressed bitmap comprising for each of the strokes, a plurality of run-length encoded line codes each for one of the line segments or a group of the line segments in that stroke, each line code representing one of the line segments or a group of the line segments, wherein the stream of line codes provides absolute values for position and length of a first line segment and relative values of position and length for other line segments, the line codes being arranged as a stream in the order in which the line segments appear in that stroke, and the streams of the line codes for the strokes being arranged in sequence as a set representing the symbol, the method comprising the steps of: decoding each line code in each stream to form a corresponding line definition for one or more line segments for each line code; and rendering each stroke by rendering the one or more line segments defined by each line definition, thereby rendering the losslessly compressed bitmap as a symbol, wherein each line or group of lines of a stroke has been encoded by one of a plurality of encoding methods to form a line code; and the line codes presented as a stream in the order in which the lines appear in that stroke, the lines each being encoded in dependence upon their position and length relative to the preceding line in that stroke, at least one of the encoding methods producing as such a line code a parameterless control code, wherein, in the case of decoding such a parameterless line code, the decoding step comprises the step of forming the line definition in accordance with the value of the control code.
 5. A computer comprising: a processor; memory accessible by the processor for storing data including a losslessly compressed bitmap of a symbol, wherein the symbol has been divided into a plurality of strokes each of which comprises a plurality of parallel, laterally adjacent, continuous line segments, the compressed bitmap comprising for each of the strokes, a plurality of run-length encoded line codes each for one of the line segments or a group of the line segments in that stroke, each line code representing one of the line segments or a group of the line segments; and a decompressor circuit comprising: means for generating coordinate value pairs; and at least one stroke engine operable to read a stream of the line codes from the memory and to decode the line codes into corresponding line segments to produce an output signal corresponding to presence or absence of any line segment of the symbol at a generated coordinate value pair, whereby a bitmap rendering of the symbol is provided by the decompressor circuit to the memory.
 6. A computer as claimed in claim 5, wherein the decompressor circuit has a plurality of such stroke engines and further includes an access arbiter for arbitrating requests from the stroke engines to access the memory.
 7. A computer as claimed in claim 5, wherein the generating means comprises: a first sequencer which is operable to step through values of one of the coordinates; a second sequencer which, for each value of said one coordinate, is operable to step through values of the other coordinate; and a write controller which is operable to address the memory in dependence upon the coordinate values and supply data to the memory in dependence upon the output signal or signals.